Transaction / Regular Paper Title
Authors
Abstract
Most conventional feature selection algorithms share a drawback: a weakly ranked gene that could perform well in terms of classification accuracy when combined with an appropriate subset of genes is left out of the selection. To address this shortcoming, we propose a feature selection algorithm for sample classification in gene expression data analysis. The proposed algorithm first divides the genes into relatively small subsets (each roughly of size h), then selects informative smaller subsets of genes (of size r < h) from a subset and merges the chosen genes with another gene subset (of size r) to update the gene subset. We repeat this process until all subsets are merged into one informative subset. We illustrate the effectiveness of the proposed algorithm by analyzing three distinct gene expression datasets. Our method shows promising classification accuracy on all the test datasets. We also show the relevance of the selected genes in terms of their biological functions.
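The divide, select, and merge loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how a subset is scored or searched, so the leave-one-out nearest-centroid accuracy (`centroid_accuracy`), the exhaustive search over size-r subsets (`best_subset`), the contiguous blocks of size h, and all function names are assumptions made for the example.

```python
from itertools import combinations
from statistics import mean


def centroid_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on `genes`.

    Assumed stand-in criterion; the abstract does not name the classifier.
    """
    correct = 0
    for i in range(len(X)):
        # Class centroids computed without the held-out sample i.
        centroids = {}
        for label in set(y):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            if rows:
                centroids[label] = [mean(row[g] for row in rows) for g in genes]
        point = [X[i][g] for g in genes]
        predicted = min(
            centroids,
            key=lambda lab: sum((p - c) ** 2 for p, c in zip(point, centroids[lab])),
        )
        correct += predicted == y[i]
    return correct / len(X)


def best_subset(X, y, pool, r, score=centroid_accuracy):
    """Pick the size-r subset of `pool` with the highest score (exhaustive search)."""
    if len(pool) <= r:
        return list(pool)
    return list(max(combinations(pool, r), key=lambda genes: score(X, y, genes)))


def merge_select(X, y, h, r, score=centroid_accuracy):
    """Split gene indices into blocks of size h, then repeatedly keep r genes
    from the current pool and merge them with the next block."""
    n_genes = len(X[0])
    blocks = [list(range(s, min(s + h, n_genes))) for s in range(0, n_genes, h)]
    selected = best_subset(X, y, blocks[0], r, score)
    for block in blocks[1:]:
        # Genes kept so far compete together with the genes of the next block.
        selected = best_subset(X, y, selected + block, r, score)
    return selected
```

Here `X` is a list of samples (each a list of expression values) and `y` the class labels, so `merge_select(X, y, h=3, r=1)` returns the indices of the genes that survive the final merge. Because subsets are scored jointly rather than gene by gene, a gene that ranks poorly on its own can still be retained when it helps a subset classify well. Exhaustive search is only feasible because h and r are small; a greedy search could be substituted for larger values.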
Similar resources
Transaction / Regular Paper Title
This paper investigates the use of distributed processing on the problem of emotion recognition from physiological sensors using a popular machine learning library in distributed mode. Specifically, we run a random forests classifier on the biosignal data, which have been pre-processed to form exclusive groups in an unsupervised fashion, on a Cloudera cluster using Mahout. The use of distribute...
Transaction / Regular Paper Title
Constructing models of dynamic systems is an important skill in both mathematics and science instruction. However, it has proved difficult to teach. Dragoon is an intelligent tutoring system intended to quickly and effectively teach this important skill. This paper describes Dragoon and an evaluation of it. The evaluation randomly assigned students in a university class to either Dragoon or bas...
Transaction / Regular Paper Title
Approximate Nearest Neighbor (ANN) search has become a popular approach for performing fast and efficient retrieval on very large-scale datasets in recent years, as the size and dimension of data grow continuously. In this paper, we propose a novel vector quantization method for ANN search which enables faster and more accurate retrieval on publicly available datasets. We define vector quantiza...
Transaction / Regular Paper Title
Valid-time indeterminacy is “don’t know when” indeterminacy, coping with cases in which one does not exactly know when a fact holds in the modeled reality. In this paper, we first propose a reference representation (data model and algebra) in which all possible temporal scenarios induced by valid-time indeterminacy can be extensionally modeled. We then specify a family of sixteen more compact r...
Transaction / Regular Paper Title
With the shifting focus of organizations and governments towards digitization of academic and technical documents, there has been an increasing need to use this reserve of scholarly documents for developing applications that can facilitate and aid in better management of research. In addition to this, the evolving nature of research problems has made them essentially interdisciplinary. As a res...
A First Step Towards Implementing Dynamic Algebraic Dependencies
We present a class of dynamic constraints (DADs) which are of practical interest and allow one to express restrictions such as: if some property holds now, then in the past some other property should have been true. The paper investigates in a constructive manner the definition of transaction-based specifications equivalent to DAD-constraint-based specifications. Our study shows the limitation ...
Publication date: 2011